Search CORE

17 research outputs found

Runtime Management of Multiprocessor Systems for Fault Tolerance, Energy Efficiency and Load Balancing

Author: Tzilis Stavros
Publication venue
Publication date: 01/01/2019
Field of study

Efficiency of modern multiprocessor systems is hurt by unpredictable events: aging causes permanent faults that disable components; application spawnings and terminations taking place at arbitrary times, affect energy proportionality, causing energy waste; load imbalances reduce resource utilization, penalizing performance. This thesis demonstrates how runtime management can mitigate the negative effects of unpredictable events, making decisions guided by a combination of static information known in advance and parameters that only become known at runtime. We propose techniques for three different objectives: graceful degradation of aging-prone systems; energy efficiency of heterogeneous adaptive systems; and load balancing by means of work stealing. Managing aging-prone systems for graceful efficiency degradation, is based on a high-level system description that encapsulates hardware reconfigurability and workload flexibility and allows to quantify system efficiency and use it as an objective function. Different custom heuristics, as well as simulated annealing and a genetic algorithm are proposed to optimize this objective function as a response to component failures. Custom heuristics are one to two orders of magnitude faster, provide better efficiency for the first 20% of system lifetime and are less than 13% worse than a genetic algorithm at the end of this lifetime. Custom heuristics occasionally fail to satisfy reconfiguration cost constraints. As all algorithms\u27 execution time scales well with respect to system size, a genetic algorithm can be used as backup in these cases. Managing heterogeneous multiprocessors capable of Dynamic Voltage and Frequency Scaling is based on a model that accurately predicts performance and power: performance is predicted by combining static, application-specific profiling information and dynamic, runtime performance monitoring data; power is predicted using the aforementioned performance estimations and a set of platform-specific, static parameters, determined only once and used for every application mix. Three runtime heuristics are proposed, that make use of this model to perform partial search of the configuration space, evaluating a small set of configurations and selecting the best one. When best-effort performance is adequate, the proposed approach achieves 3% higher energy efficiency compared to the powersave governor and 2x better compared to the interactive and ondemand governors. When individual applications\u27 performance requirements are considered, the proposed approach is able to satisfy them, giving away 18% of system\u27s energy efficiency compared to the powersave, which however misses the performance targets by 23%; at the same time, the proposed approach maintains an efficiency advantage of about 55% compared to the other governors, which also satisfy the requirements. Lastly, to improve load balancing of multiprocessors, a partial and approximate view of the current load distribution among system cores is proposed, which consists of lightweight data structures and is maintained by each core through cheap operations. A runtime algorithm is developed, using this view whenever a core becomes idle, to perform victim core selection for work stealing, also considering system topology and memory hierarchy. Among 12 diverse imbalanced workloads, the proposed approach achieves better performance than random, hierarchical and local stealing for six workloads. Furthermore, it is at most 8% slower among the other six workloads, while competing strategies incur a penalty of at least 89% on some workload

Chalmers Research

DeSyRe: on-Demand System Reliability

Author: Armato Antonino
Bouganis Christos-Savvas
Falsafi Babak
Gaydadjiev Georgi
Isaza Sebastian
Malek Alirad
Mariani Riccardo
Pnevmatikatos Dionisios N
Pradhan Dhiraj K
Rauwerda Gerard
Seepers Robert
Shafik Rishad Ahmed
Sourdis Ioannis
Strydis Christos
Sunesen Kim
Theodoropoulos Dimitris
Tzilis Stavros
Vavouras Michail
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013
Field of study

The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

Southampton (e-Prints Soton)

EUR Research Repository

Chalmers Research

Chalmers Publication Library

Explore Bristol Research

Comparison of Psychological Distress between Type 2 Diabetes Patients with and without Proteinuria

Author: Armato A.
Bouganis C. S.
Falsafi B.
Gaydadjiev Georgi
Isaza S.
Malek Alirad
Mariani R.
Pagliarini S.N.
Pnevmatikatos Dionisios N.
Pradhan D.K.
Rauwerda G.K.
Seepers R.M.
Shafik R.A.
Smaragdos G.
Sourdis Ioannis
Strydis C.
Theodoropoulos D.
Tzilis Stavros
Vavouras M.
Publication venue: Okayama University Medical School
Publication date: 01/01/2014
Field of study

We investigated the link between proteinuria and psychological distress among patients with type 2 diabetes mellitus (T2DM). A total of 130 patients with T2DM aged 69.1±10.3 years were enrolled in this cross-sectional study. Urine and blood parameters, age, height, body weight, and medications were analyzed, and each patient’s psychological distress was measured using the six-item Kessler Psychological Distress Scale (K6). We compared the K6 scores between the patients with and without proteinuria. Forty-two patients (32.3%) had proteinuria (≥±) and the level of HbA1c was 7.5±1.3%. The K6 scores of the patients with proteinuria were significantly higher than those of the patients without proteinuria even after adjusting for age and sex. The clinical impact of proteinuria rather than age, sex and HbA1c was demonstrated by a multiple regression analysis. Proteinuria was closely associated with higher psychological distress. Preventing and improving proteinuria may reduce psychological distress in patients with T2DM

Crossref

Southampton (e-Prints Soton)

EUR Research Repository

Chalmers Research

Okayama University Scientific Achievement Repository

Explore Bristol Research

DeSyRe: On-demand system reliability

Author: Armato A.
Bouganis Christos Savvas
Falsafi Babak
Gaydadjiev Georgi Nedeltchev
Isaza Sebastian
Malek Alirad
Mariani R.
Pnevmatikatos Dionisios N.
Pradhan Dhiraj K.
Rauwerda Gerard K.
Seepers Robert M.
Shafik Rishad A.
Sourdis Ioannis
Strydis Christos
Sunesen Kim
Theodoropoulos Dimitris
Tzilis Stavros
Vavouras Michalis
Publication venue: 'Elsevier BV'
Publication date: 26/02/2014
Field of study

The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect-/fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints. (C) 2013 Elsevier B.V. All rights reserved

Infoscience - École polytechnique fédérale de Lausanne

Graceful Degradation of Adaptive Multiprocessor Systems on a Chip

Author: Tzilis Stavros
Publication venue
Publication date: 01/01/2015
Field of study

This thesis explores the potential for using existing flexibility in order to allow Multiprocessor Systems on a Chip to function in the presence of permanent faults and to prolong their lifetime. Technology scaling in accordancewith Moore’s law brings up reliability challenges and forces the use of potentially unreliable hardware components. However, hardware reconfigurabilityand workload flexibility can provide the means for permanent fault tolerance and graceful degradation via runtime system management. This work first elaborates on the concept of degradable hardware components and presentsa methodology for characterizing each of their possible configurations. This is necessary if we intend to use this reconfigurability in an efficient manner to work around permanent faults. The characterization methodology is used to perform design space exploration aiming to find the optimal reconfiguration granularity for any given fault density. Subsequently, Graceful Degradation of Multiprocessor Systems on a Chip is defined as an optimization problem. Three algorithms are proposed for solving this problem within reasonable time, in order to be applicable at runtime: A novel fast heuristic based onincremental and partially precomputed solutions, and our versions of two well established search algorithms (simulated annealing and genetic algorithm), tailored to the particular problem. The fast heuristic is proven able to find a solution on average 81.9% as good as the exhaustively sought optimal one, in less than 2μsec on average. Our versions of simulated annealing and geneticalgorithm find on average better solutions (86.6% and 89.6% as good as the optimal respectively), at the cost of one and two orders of magnitude slower execution time

Chalmers Research

A runtime manager for gracefully degrading SoCs

Author: Sourdis Ioannis
Tzilis Stavros
Publication venue
Publication date: 01/01/2014
Field of study

The increasing number of transistors integrated on a single chip comes with the blessing of raw computational power and the curse of susceptibility to various kinds of faults. On top of increased defect densities, wearout effects mean that the testing verdict at fabrication time cannot be trusted throughout the chip lifetime. However, extra computational power presents the opportunity to build gracefully degrading MPSoCs. Re-configurable components and flexible workloads, along with runtime support, enable MPSoCs to deal with permanent faults degrading one or more system aspects, such as performance, energy efficiency and delivered functionality, instead of failing. In this manner, chip life is prolonged and safety is increased. In this work Graceful Degradation (GD) is formulated as an optimization problem in the context of MPSoCs. As such, its possible solutions can be evaluated in a parameterizable and consistent manner. An attempt at a runtime solution for a heterogeneous 4-core SoC is made and the resulting GD manager is evaluated in terms of speed and accuracy, with a use case combining essential automotive tasks and non-essential additional features. On average, it is found to produce a solution 89% as good as the optimal, in 4.3μsec running on one core of a common modern CPU

Crossref

Chalmers Research

Chalmers Publication Library

The DeSyRe runtime support for fault-tolerant embedded MPSoCs

Author: Pnevmatikatos D.N.
Sourdis Ioannis
Tzilis Stavros
Publication venue
Publication date: 01/01/2014
Field of study

Semiconductor technology scaling makes chips moresensitive to faults. This paper describes the DeSyRe designapproach and its runtime management for future reliable embedded Multiprocessor Systems-on-Chip (MPSoCs). A light weight runtime system is described for shared-memory MPSoCs to support fault-tolerant execution upon detection of transient and permanent faults. The DeSyRe runtime system offers re-execution of tasks that suffer from transient faults and task-migration in cases where a worker processor is permanently faulty. In addition, a faulty worker can potentially remainusable, increasing systems fault-tolerance. This is achieved using alternative task implementations, which avoid the faulty circuit and are indicated in the application-code via pragma annotations, as well as by repairing a faulty core via hardware reconfiguration. Thereby, the system can be dynamically adapted using one ormultiple of the above mechanisms to mitigate faults. The DeSyReruntime system is evaluated using micro-benchmarks running ona Virtex-6 FPGA MPSoC. Results suggest that our enhance dfault-tolerant runtime system can successfully and efficiently execute all application tasks under a variety of fault cases

Chalmers Research

Chalmers Publication Library

RQNoC: A resilient quality-of-service network-on-chip with service redirection

Author: He Yifan
Malek Alirad
Rauwerda G.K.
Sourdis Ioannis
Tzilis Stavros
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

In this article, we describe RQNoC, a service-oriented Network-on-Chip (NoC) resilient to permanent faults. We characterize the network resources based on the particular service that they support and, when faulty, bypass them, allowing the respective traffic class to be redirected. We propose two alternatives for service redirection, each having different advantages and disadvantages. The first one, Service Detour, uses longer alternative paths through resources of the same service to bypass faulty network parts, keeping traffic classes isolated. The second approach, Service Merge, uses resources of other services providing shorter paths but allowing traffic classes to interfere with each other. The remaining network resources that are common for all services employ additional mechanisms for tolerating faults. Links tolerate faults using additional spare wires combined with a flit-shifting mechanism, and the router control is protected with Triple-Modular-Redundancy (TMR). The proposed RQNoC network designs are implemented in 65nm technology and evaluated in terms of performance, area, power consumption, and fault tolerance. Service Detour requires 9% more area and consumes 7.3% more power compared to a baseline network, not tolerant to faults. Its packet latency and throughput is close to the fault-free performance at low-fault densities, but fault tolerance and performance drop substantially for 8 or more network faults. Service Merge requires 22% more area and 27% more power than the baseline and has a 9% slower clock. Compared to a faultfree network, a Service Merge RQNoC with up to 32 faults has increased packet latency up to 1.5 to 2.4 7 and reduced throughput to 70% or 50%. However, it delivers substantially better fault tolerance, having a mean network connectivity above 90% even with 32 network faults versus 41% of a Service Detour network. Combining Serve Merge and Service Detour improves fault tolerance, further sustaining a higher number of network faults and reduced packet latency

Chalmers Research

SWAS: Stealing Work Using Approximate System-Load Information

Author: Pericas Miquel
Petersen Moura Trancoso Pedro
Sourdis Ioannis
Tzilis Stavros
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

This paper explores the potential of utilizing approximate system load information to enhance work stealing for dynamic load balancing in hierarchical multicore systems. Maintaining information about the load of a system has not been extensively researched since it is assumed to introduce performance overheads. We propose SWAS, a lightweight approximate scheme for retrieving and using such information, based on compact bit vector structures and lightweight update operations. This approximate information is used to enhance the effectiveness of work stealing decisions. Evaluating SWAS for a number of representative scenarios on a multi-socket multi-core platform showed that work stealing guided by approximate system load information achieves considerable performance improvements: up to 18.5% for dynamic, severely imbalanced workloads; and up to 34.4% for workloads with complex task dependencies, when compared with random work stealing

Crossref

Chalmers Research

Chalmers Publication Library

Energy-efficient Runtime Management of Heterogeneous Multicores using Online Projection

Author: Petersen Moura Trancoso Pedro
Sourdis Ioannis
Tzilis Stavros
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

Heterogeneous multicores offer flexibility in the form of different core types and Dynamic Voltage and Frequency Scaling (DVFS), defining a vast configuration space. The optimal configuration choice is not always straightforward, even for single applications, and becomes a very difficult problem for dynamically changing scenarios of concurrent applications with unpredictable spawn and termination times and individual performance requirements. This article proposes an integrated approach for runtime decision making for energy efficiency on such systems. The approach consists of a model that predicts performance and power for any possible decision and low-complexity heuristics that use this model to evaluate a subset of possible decisionsto choose the best. The model predicts performance by projecting standalone application profiling data to the current status of the system and power by using a set of platform-specific parameters that are determined only once for a given system and are independent of the application mix. Our approach is evaluated with a plethora of dynamic, multi-application scenarios. When considering best effort performance to be adequate, our runtime achieves on average 3% higher energy efficiency compared to the powersave governor and 2 7 better compared to the other linux governors. Moreover, when also considering individual applications’ performance requirements, our runtime is able to satisfy them, giving away 18% of the system’s energy efficiencycompared to the powersave, which, however, misses the performance targets by 23%; at the same time, our runtime maintains an efficiency advantage of about 55% compared to the other governors, which also satisfy the performance constraints

Chalmers Research